54 research outputs found

    Toward the automated generation of genome-scale metabolic networks in the SEED

    Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. Conclusion: Our method sets the stage for the automated generation of substantially complete metabolic networks for over 400 complete genome sequences currently in the SEED. With each genome that is processed using our tools, the database of common components grows to cover more of the diversity of metabolic pathways. This increases the likelihood that components of reaction networks for subsequently processed genomes can be retrieved from the database, rather than assembled and verified manually.

    Identification and Analysis of Bacterial Genomic Metabolic Signatures

    With continued rapid growth in the number and quality of fully sequenced and accurately annotated bacterial genomes, we have unprecedented opportunities to understand metabolic diversity. We selected 101 diverse and representative completely sequenced bacteria and implemented a manual curation effort to identify 846 unique metabolic variants present in these bacteria. The presence or absence of these variants acts as a metabolic signature for each of the bacteria, which can then be used to understand similarities and differences between and across bacterial groups. We propose a novel and robust method of summarizing metabolic diversity using metabolic signatures and use this method to generate a metabolic tree, clustering metabolically similar organisms. Analysis of the resulting metabolic tree confirms strong associations with well-established biological results and gives direct insight into the particular metabolic variants that are most predictive of metabolic diversity. The positive results of this manual curation effort and novel method development suggest that future work is needed to further expand the set of bacteria to which this approach is applied and to use the resulting tree to test broad questions about metabolic diversity and complexity across the bacterial tree of life

    Improvements to Bayesian Gene Activity State Estimation from Genome-Wide Transcriptomics Data

    An important task in many biological applications is to estimate or classify gene activity states (active or inactive) based on genome-wide transcriptomics data. Recently, we proposed a Bayesian method, titled MultiMM, which showed superior results compared to existing methods. In short, MultiMM performed better than existing methods on both simulated and real gene expression data, confirming well-known biological results and yielding better agreement with fluxomics data. Despite these promising results, MultiMM has numerous limitations. First, MultiMM leverages co-regulatory models to improve activity state estimates, but information about co-regulation is incorporated in a manner that assumes that networks are known with certainty. Second, MultiMM assumes that genes that change states in the dataset can be distinguished with certainty from those that remain in one state. Third, the model can be sensitive to extreme measures (outliers) of gene expression. In this manuscript, we propose a modified Bayesian approach, which addresses these three limitations by improving outlier handling and by explicitly modeling network and other uncertainty, yielding improved gene activity state estimates when compared to MultiMM

    Evaluating the Consistency of Gene Sets Used in the Analysis of Bacterial Gene Expression Data

    Background: Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. Results: We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Conclusions: Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses of such data. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data

    Implementing and Evaluating a Gaussian Mixture Framework for Identifying Gene Function from TnSeq Data

    The rapid acceleration of microbial genome sequencing increases opportunities to understand bacterial gene function. Unfortunately, only a small proportion of genes have been studied. Recently, TnSeq has been proposed as a cost-effective, highly reliable approach to predict gene functions as a response to changes in a cell's fitness before and after genomic changes. However, major questions remain about how best to determine whether an observed quantitative change in fitness represents a meaningful change. To address this limitation, we develop a Gaussian mixture model framework for classifying gene function from TnSeq experiments. In order to implement the mixture model, we present the Expectation-Maximization algorithm and a hierarchical Bayesian model sampled using Stan's Hamiltonian Monte-Carlo sampler. We compare these implementations against the frequentist method used in current TnSeq literature. From simulations and real data produced by E. coli TnSeq experiments, we show that the Bayesian implementation of the Gaussian mixture framework provides the most consistent classification results
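    As a rough illustration of the Expectation-Maximization idea mentioned in this abstract, the sketch below fits a two-component, one-dimensional Gaussian mixture to simulated values. It is not the authors' implementation: the function names, the two-component choice, the initialisation, and the simulated data are all assumptions made for this example, and the hierarchical Bayesian (Stan) variant is not shown.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(values, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM (hypothetical helper)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                   # crude initialisation at the extremes
    spread = (hi - lo) / 4.0 or 1.0    # fall back to 1.0 if all values are equal
    sds = [spread, spread]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each value came from each component.
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component is empty; leave its parameters unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)  # floor avoids a zero-width component
    return weights, means, sds, resp


# Simulated values standing in for per-gene fitness changes (assumed data, not TnSeq output).
rng = random.Random(0)
values = [rng.gauss(-2.0, 0.5) for _ in range(200)] + [rng.gauss(2.0, 0.5) for _ in range(200)]
weights, means, sds, resp = fit_two_component_gmm(values)
# Assign each value to the component with the larger posterior probability.
labels = [0 if r[0] >= r[1] else 1 for r in resp]
```

    In this sketch a value is assigned to whichever component has the larger posterior probability; a real analysis would need to choose the number of components and a classification threshold to suit the experiment.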

    CytoSEED: a Cytoscape plugin for viewing, manipulating and analyzing metabolic models created by the Model SEED

    Summary: CytoSEED is a Cytoscape plugin for viewing, manipulating and analyzing metabolic models created using the Model SEED. The CytoSEED plugin enables users of the Model SEED to create informative visualizations of the reaction networks generated for their organisms of interest. These visualizations are useful for understanding organism-specific biochemistry and for highlighting the results of flux variability analysis experiments

    Gene set analyses for interpreting microarray experiments on prokaryotic organisms

    Background: Despite the widespread usage of DNA microarrays, questions remain about how best to interpret the wealth of gene-by-gene transcriptional levels that they measure. Recently, methods have been proposed which use biologically defined sets of genes in interpretation, instead of examining results gene-by-gene. Despite a serious limitation, a method based on Fisher's exact test remains one of the few plausible options for gene set analysis when an experiment has few replicates, as is typically the case for prokaryotes. Results: We extend five methods of gene set analysis from use on experiments with multiple replicates, for use on experiments with few replicates. We then use simulated and real data to compare these methods with each other and with the Fisher's exact test (FET) method. As a result of the simulation we find that a method named MAXMEAN-NR maintains the nominal rate of false positive findings (type I error rate) while offering good statistical power and robustness to a variety of gene set distributions for set sizes of at least 10. Other methods (ABSSUM-NR or SUM-NR) are shown to be powerful for set sizes less than 10. Analysis of three sets of experimental data shows similar results. Furthermore, the MAXMEAN-NR method is shown to be able to detect biologically relevant sets as significant, when other methods (including FET) cannot. We also find that the popular GSEA-NR method performs poorly when compared to MAXMEAN-NR. Conclusion: MAXMEAN-NR is a method of gene set analysis for experiments with few replicates, as is common for prokaryotes. Results of simulation and real data analysis suggest that the MAXMEAN-NR method offers increased robustness and biological relevance of findings as compared to FET and other methods, while maintaining the nominal type I error rate
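    For readers unfamiliar with the Fisher's exact test baseline named in this abstract, the sketch below computes a one-sided over-representation p-value for a single gene set from the hypergeometric distribution. It is a generic textbook formulation, not the paper's code: the function name, argument names, and example counts are assumptions, and the MAXMEAN-NR, ABSSUM-NR, SUM-NR and GSEA-NR methods are not reproduced here.

```python
from math import comb


def fisher_enrichment_p(total_genes, set_size, n_selected, overlap):
    """One-sided Fisher's exact test p-value for gene set over-representation.

    Returns P(X >= overlap), where X is the number of gene-set members among
    n_selected genes drawn without replacement from total_genes genes, of
    which set_size belong to the gene set (hypergeometric distribution).
    """
    max_overlap = min(set_size, n_selected)
    denom = comb(total_genes, n_selected)
    tail = sum(
        comb(set_size, k) * comb(total_genes - set_size, n_selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denom


# Hypothetical counts: 4000 genes on the array, a 20-gene set,
# 200 genes called differentially expressed, 5 of them in the set.
p_value = fisher_enrichment_p(total_genes=4000, set_size=20, n_selected=200, overlap=5)
```

    Note that this test needs a hard cutoff to decide which genes count as selected, so it works on a list of called genes rather than on the expression values themselves.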

    Development and implementation of a physician-pharmacist collaborative practice model for provision and management of buprenorphine/naloxone

    Introduction: Physician-pharmacist collaborative practice models (PPCPM) decrease barriers and increase access to medications for opioid use disorder (MOUD) but are not routine in practice. The purpose of this quality improvement initiative is to develop and implement a PPCPM for management of patients on MOUD with buprenorphine/naloxone to minimize provider burden, expand access to treatment, and enhance overall patient care. Methods: A PPCPM for management of patients on MOUD with buprenorphine/naloxone was piloted in an outpatient substance use disorder clinic. Approximately 4 hours per week were dedicated to physician-pharmacist collaborative medical appointments for a 5-month trial period. The pharmacist met with the patient first and then staffed the case with the collaborating psychiatrist. Descriptive data from PPCPM appointments were collected and compared to data from psychiatrist-only appointments. Results: Twenty-five patients were seen over 44 appointments with an estimated 33 hours of psychiatrist time saved. Average initial and end buprenorphine doses, urine drug screen (UDS) results, and mental health (MH) medication interventions were similar between patients seen in PPCPM appointments and those seen in psychiatrist-only appointments. Collection of UDS, identification and management of MOUD adherence issues, other service referrals, and medication reconciliation interventions were more frequent in PPCPM appointments. Discussion: Implementation of a PPCPM allowed for provision of a similar level of care regarding MOUD and MH-related medication management while saving psychiatrist time. Other enhancements to patient care provided through pharmacist intervention included more frequent identification and management of MOUD adherence issues, referral for other services, and medication reconciliation interventions